Abstract

A solid-state nanopore can electrophoretically capture a DNA molecule and pull it through in a 
folded configuration. The resulting ionic current signal indicates where along its length the DNA 
was captured. A statistical study using an 8 nm wide nanopore reveals a strong bias favoring the 
capture of molecules near their ends. A theoretical model shows that bias to be a consequence of 
configurational entropy, rather than a search by the polymer for an energetically favorable
configuration. We also quantified the fluctuations and length-dependence of the speed of simultaneously
translocating polymer segments from our study of folded DNA configurations. 






A voltage-biased nanopore is a single-molecule detector that registers the disruption of I,
the ionic current through the nanopore, caused by the insertion of a linear polyelectrolyte
[1–3]. Most previous studies have focused on instances where the nanopore electrophoretically
captures DNA at one end and then slides it through in a linear, head-to-tail fashion.
However, a ~10 nm-wide solid-state nanopore can also capture DNA some distance from
its end and pull it through in a folded configuration [4–6]. Folded DNA translocations entail
the simultaneous motion of multiple segments through the nanopore, which may exhibit
cooperative behavior that alters the translocation dynamics [7]. The mechanical bending
energy associated with folds may influence the capture of DNA [8]. Importantly, the study
of folded configurations provides snapshots of molecules at the moment of insertion, which 
offer clues about how the nanopore captures them from solution. The capture process is 
relevant to applications of nanopores that seek to extract sequence-related information from 
unfolded molecules. 

When DNA encounters a nanopore, the electrophoretic force can initiate translocation 
by inducing a hairpin fold in the molecule that protrudes into the nanopore. Two segments 
of DNA extend from the initial fold, a long one of length L_l and a short one of length L_s
(Fig. 1(a)). The capture location, x = L_s/(L_s + L_l), is the fractional contour distance from the
initial fold to the nearest end. The time for each segment to translocate is measurable from
the time trace of I [4–6] and can be used to estimate x. Storm et al. inferred the distribution
of x for λ DNA translocations and concluded that folds occur with equal probability
everywhere along a molecule's length, but that the DNA is more likely to be captured at its
ends because of the lower energetic cost of threading an unfolded molecule [6]. This implies
that molecules test multiple configurations prior to capture, which is a statistical process 
governed by energetic considerations. By contrast, Chen et al. reported a bias for unfolded 
translocations that increased with applied voltage [5]. This finding implies that molecules 
pre-align in the fields outside the pore rather than sample multiple configurations prior to 
capture. No model for the distribution of x is available to help evaluate these competing 
pictures. 

Here, we present a study of DNA translocations of an 8 nm-wide solid-state nanopore
which reveals a strongly biased distribution of capture locations, where the probability
of capture increases continuously and rapidly towards the DNA's ends. The equilibrium
distribution of polymer configurations outside the nanopore offers a natural explanation for
this surprising finding. We present a simple but successful model of that distribution in
which only the configurational entropy is important. Finally, we show that a constant mean
translocation velocity and Gaussian velocity fluctuations explain the translocation dynamics
of folded DNA well, but that a weak length-dependence of the mean segment velocity exists.

FIG. 1: a) A nanopore captures DNA from solution and initiates electrophoretic translocation by
forming a hairpin. Segments of length L_l and L_s extend from the capture location. (Detail) TEM
image of the 8 nm wide nanopore used. b) Ionic current traces from translocation events of type
1, 2-1, and 2 indicate the capture location. c) The ionic current trace of a folded DNA molecule
shows t_2, t_tot, and ECD.

The 8 nm diameter solid-state nanopore we used (Fig. 1(a), detail) was fabricated in
a 20 nm-thin low-stress silicon nitride membrane following procedures described elsewhere
[9]. The nanopore bridged two fluid reservoirs containing degassed aqueous 1 M KCl, 10
mM Tris-HCl, 1 mM EDTA buffer (pH 7.7). An electrometer (Axon Axopatch) applied
100 mV across the nanopore and monitored I using two Ag/AgCl electrodes immersed in
the reservoirs. A 10 kHz, 8-pole, low-pass Bessel filter conditioned I prior to digitization at
50 kilo-samples per second. The open-pore current was I = 3.6 nA. After adding λ DNA
(16.5 μm long, New England Biolabs) to the negatively charged reservoir at a concentration
of 24 μg/mL, transient blockages in I were observed, such as the ones shown in Fig. 1(b).

The blockages show quantized steps in I that indicate where the nanopore captured each
molecule, as illustrated in Fig. 1(b). Unfolded molecules decreased I by ≈ 0.278 nA for the
full duration of the translocation event, t_tot. We call these "type 1" events. Folded molecules
cause two segments to occupy the nanopore simultaneously, thereby doubling the reduction
in I for a time t_2. Two segments occupied the nanopore for the full duration of "type 2"
events, indicating molecules captured at the midpoint. A transition from double to single
occupancy was observed in "type 2-1" events, indicating molecules captured somewhere
between an end and the midpoint. Fig. 1(c) shows a type 2-1 event that illustrates t_tot and
t_2; we judged the occupancy of the nanopore to have changed when I rose or fell 80% of
the way to the next blockage level. We also observed event types which indicate molecules
captured and folded by the nanopore at multiple locations. For the present study, however,
we restrict our attention to translocations with at most a single fold, which account for
~70% of all events.
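
The 80% criterion described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the function names, the synthetic trace, and the idea of treating every blockage level as an integer multiple of a single step are ours, not a description of the analysis code actually used.

```python
def occupancy_trace(current, open_level, step, frac=0.8):
    """Assign an occupancy (0, 1, 2, ...) to each current sample.

    The occupancy changes only once the current has moved `frac` of the way
    from the present blockage level to the neighbouring level.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    state = 0
    states = []
    for i_now in current:
        # Deeper blockage: current fell at least `frac` of a step below the present level.
        while i_now <= open_level - (state + frac) * step:
            state += 1
        # Shallower blockage: current rose at least `frac` of a step above the present level.
        while state > 0 and i_now >= open_level - (state - frac) * step:
            state -= 1
        states.append(state)
    return states


def event_times(states, dt):
    """Return (t_2, t_tot): time at double occupancy and time at any occupancy."""
    t_2 = dt * sum(1 for s in states if s == 2)
    t_tot = dt * sum(1 for s in states if s >= 1)
    return t_2, t_tot


# Synthetic type 2-1 event (currents in nA, invented for illustration only).
trace = [3.6, 3.6, 3.05, 3.05, 3.32, 3.32, 3.6]
states = occupancy_trace(trace, open_level=3.6, step=0.278)
t_2, t_tot = event_times(states, dt=0.02)  # 0.02 ms per sample at 50 kS/s
```

A real trace would also need filtering-aware handling of the finite rise time, which this sketch ignores.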

We found evidence that a minority of the current blockages were caused by fragments of λ
DNA that we wish to exclude from further analysis. We considered the event charge deficit
(ECD), which is the current blockage integrated over the duration of an event (illustrated
in Fig. 1(c)). Fig. 2(a) plots the ECD distributions for events of type 1, 2-1 and 2. Most
events fall into the main peaks that are centered at 0.408 ± 0.003 pC, regardless of the
event type. We attribute those events to intact λ DNA molecules [1]. Minor peaks in the
distributions near 0.15 pC likely correspond to fragments of those molecules. To obtain a
monodisperse ensemble, we excluded all events with ECD < 0.27 pC from further analysis.
We also excluded six events with ECD > 3 pC, presumably caused by molecules that stuck
to the nanopore. These restrictions leave us with an ensemble of ~1100 identical λ DNA
molecules that translocated with at most a single fold.
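
As a rough illustration of the ECD cut, the following sketch integrates the blockage over a sampled event and applies the two thresholds quoted above. The function names and the synthetic event are hypothetical; only the thresholds, the sampling interval, and the blockage depth come from the text.

```python
def event_charge_deficit(current, open_level, dt):
    """Integrate the blockage (open_level - current) over an event.

    With current in nA and dt in ms, the result is in pC.
    """
    return dt * sum(open_level - i_now for i_now in current)


def is_intact_single_molecule(ecd, low=0.27, high=3.0):
    """Apply the ECD cuts quoted in the text (both thresholds in pC)."""
    return low <= ecd <= high


# Synthetic unfolded event: 73 samples at the single-occupancy level (illustrative values).
samples = [3.6 - 0.278] * 73
ecd = event_charge_deficit(samples, open_level=3.6, dt=0.02)
```

For this rectangular event the ECD is simply depth times duration, 0.278 nA × 1.46 ms ≈ 0.406 pC, which falls inside the accepted window.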

For each translocation event, we obtained the capture location, x, by assuming that
the translocation speed, v, was constant over the duration of the event, which follows the
approach of Storm et al. [6] and gives:

$$ x = \frac{t_2}{t_2 + t_{\mathrm{tot}}}. \tag{1} $$

Below we shall investigate the accuracy of that assumption and explore the consequences of 
fluctuations and a contour length dependence in v. 
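
Under the constant-speed assumption, the short segment has length v·t_2 and the long one v·t_tot, so v cancels from the fractional capture location. A minimal sketch of that estimate follows; the function name and the input checks are our own additions for illustration.

```python
def capture_location(t_2, t_tot):
    """Fractional capture location x = t_2 / (t_2 + t_tot) for constant speed.

    t_2 is the time spent at double occupancy and t_tot the full event
    duration, in the same units. Type 1 events have t_2 = 0 (x = 0) and
    type 2 events have t_2 = t_tot (x = 0.5).
    """
    if t_tot <= 0 or t_2 < 0 or t_2 > t_tot:
        raise ValueError("require 0 <= t_2 <= t_tot and t_tot > 0")
    return t_2 / (t_2 + t_tot)
```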

Figure 2(b) presents a histogram of the capture locations. We selected a bin size that
avoids a possible artifact of the limited measurement bandwidth; since there is a lower bound
on t_2, it would be difficult to populate bins near x = 0 if the bin size were too small. The
distribution shows that the frequency of capture was highest near x = 0, decreasing rapidly
but smoothly with distance away from the ends, and becoming a slowly decreasing function
of x near x = 0.5. The bin that includes x = 0.5 rises above the trend.

FIG. 2: a) Overlaid ECD distributions for translocations of type 1 (dark grey), 2-1 (white), and
2 (medium grey). Events with ECD < 0.27 pC and six with ECD > 3 pC were dropped from
subsequent analyses in order to exclude fragmented and stuck DNA molecules, respectively. Only
0 pC < ECD < 1 pC is plotted for clarity. b) Distribution of capture locations. The stacked
histogram bars indicate the number of events of each type in a bin. Data points indicate the total
number of events of all types and their mean x in a bin. Error bars indicate the square root of
the total events. The distributions predicted by Eq. 3 are shown for the theoretical γ = 0.70 (solid
line) and for the weighted best fit γ = 0.46 (dashed line).

We propose a physical model to explain the distribution of capture locations. We assume 
that a DNA molecule has enough time to sample all available configurations as it approaches 
the nanopore. At the moment of capture, the nanopore randomly selects a configuration 
from the equilibrium ensemble. We model that configuration as a pair of independent 
self-avoiding walks (SAWs) of lengths L_s and L_l, tethered to the surface at a single point
representing the nanopore. We discuss these assumptions below. 

For a single polymer, the total number of SAWs of length L, Ω(L), has the following asymptotic
form [10]:

$$ \Omega(L) \sim \mu^{L} L^{\gamma - 1}. \tag{2} $$

γ is a universal scaling exponent which depends solely on the dimensionality of the lattice
and μ is the lattice coordination number. Barber et al. studied SAWs tethered to a surface
and obtained γ ≈ 0.70 from simulations on a cubic lattice [11].

The number of configurations available to a molecule captured at x, Ω(x), is the product
of the number of SAWs for each segment, Ω(L_s) and Ω(L_l). From L_s + L_l = L and Eq. 2 it
follows that Ω(x) = Ω(L_s) · Ω(L − L_s). The probability of capturing a molecule at x, P(x),
is proportional to Ω(x), therefore we find:

$$ P(x) = A\, x^{\gamma - 1} (1 - x)^{\gamma - 1}. \tag{3} $$
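
The intermediate step between Eqs. 2 and 3 can be written out as follows. This is our own expansion of the argument, assuming L_s = xL so that L − L_s = (1 − x)L; the factors that do not depend on x are absorbed into the constant A.

```latex
\begin{align}
\Omega(x) &\sim \mu^{L_s} L_s^{\gamma-1} \cdot \mu^{L-L_s} (L-L_s)^{\gamma-1} \\
          &= \mu^{L} L^{2(\gamma-1)} \, x^{\gamma-1} (1-x)^{\gamma-1},
          \qquad L_s = xL .
\end{align}
```

The lattice constant μ therefore drops out of the x dependence entirely, leaving γ as the only parameter that shapes the distribution.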

The solid line in Fig. 2(b) plots the distribution of capture locations predicted by Eq. 3 for
γ = 0.7. The proportionality constant A was obtained from a weighted least squares fit to
the data. By contrast, the best fit of Eq. 3 when γ is left as a free parameter, indicated by
the dashed line in Fig. 2(b), obtains γ = 0.46 ± 0.03.
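
To give a feel for how strongly Eq. 3 favors the ends, the sketch below evaluates the predicted density and a closed-form weighted least-squares estimate of A at fixed γ. The function names are hypothetical, and the choice of weights (1/σ² with σ equal to the square root of the bin count, as for the error bars in Fig. 2(b)) is an assumption of this sketch rather than a documented detail of the fit.

```python
def capture_density(x, gamma, amplitude=1.0):
    """Eq. 3: P(x) = A x^(gamma-1) (1-x)^(gamma-1), for 0 < x <= 0.5."""
    if not 0.0 < x <= 0.5:
        raise ValueError("x must lie in (0, 0.5]")
    return amplitude * (x * (1.0 - x)) ** (gamma - 1.0)


def fit_amplitude(xs, counts, gamma):
    """Weighted least-squares estimate of A with gamma held fixed.

    Minimises sum_i w_i (n_i - A f_i)^2 with w_i = 1/n_i, which gives
    A = sum(w n f) / sum(w f^2). All counts must be positive.
    """
    num = 0.0
    den = 0.0
    for x, n in zip(xs, counts):
        f = capture_density(x, gamma)
        w = 1.0 / n  # 1 / sigma^2 with sigma = sqrt(n)
        num += w * n * f
        den += w * f * f
    return num / den


# Ratio of the predicted density near an end (x = 0.05) to that at the midpoint.
end_bias_070 = capture_density(0.05, 0.70) / capture_density(0.5, 0.70)
end_bias_046 = capture_density(0.05, 0.46) / capture_density(0.5, 0.46)
```

With these definitions the end-to-midpoint ratio is roughly 1.6 for γ = 0.70 and roughly 2.5 for γ = 0.46, so a smaller γ corresponds to a stronger end bias.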

The two-tethered-polymer model describes the observed distribution of capture locations 
well. Note that the skewness arises naturally from configurational entropy alone; every DNA 
configuration is represented with equal probability and there is no need to invoke a bending 
energy, as Storm et al. did, to explain the preponderance of molecules captured near their 
ends [6]. The model disagrees most significantly with the data at x = 0.5, where more events
were observed than predicted. That discrepancy can be explained by the translocation of
circular λ DNA molecules, whose complementary single-stranded ends had bound, resulting
in extra type 2 events. An important implication of our model is that DNA does not search 
for an energetically favorable configuration before initiating a translocation. 

A question that our experiments cannot address is where, in relation to the nanopore, the 
capture location is determined. Within our model, x is determined at the nanopore; however, 
recent studies have identified a critical radius from the nanopore, typically on the scale of 
hundreds of nanometers, within which electrophoretic forces overwhelm diffusion [12, 13].
It is possible that the first segment to insert is transported essentially deterministically to 
the nanopore from some distance away without altering the distribution of x. Similarly, 
our assumption that a DNA molecule is at equilibrium prior to capture is not seriously
compromised if the molecule becomes stretched out of equilibrium by the field gradients only
after the capture location has been determined. The forces on DNA beyond the nanopore
may restrict the available configurations and thereby reduce γ.

A third assumption of our model worth considering is that both segments of the captured 
polymer behave independently. In addition to undergoing self-avoiding walks, both segments 
should avoid one another. Theoretically, γ decreases to ≈ 0.60 when two segments of equal
length are tethered to the same point on a surface [13].

FIG. 3: Dependence of ⟨t_tot⟩ on ⟨t_2⟩. Error bars indicate the standard deviation of the mean in an
80 μs bin. Bins with ⟨t_2⟩ > 1 ms contain an insignificant number of events (< 2). The solid line
shows the predictions of the dynamical model that includes velocity fluctuations described in text.
The dashed line accounts for the length-dependence of the translocation speed of each segment
with t ∝ L^α. The scaling exponent α = 1.19 ± 0.04 was obtained from a weighted least squares fit
to the data in the range ⟨t_2⟩ < 0.7 ms.

We next turn to the translocation dynamics of folded molecules. We estimated x for
each event by assuming that both segments translocated at the same speed; however, that
assumption ignores fluctuations in the speed and any dependence on the length of a segment,
which are both established features of unfolded DNA translocations [14, 15]. In order to
investigate our assumption in more detail, we divided the translocation data into 80 μs bins
of t_2. For each bin, ⟨t_tot⟩ and its standard deviation were calculated and plotted against ⟨t_2⟩
(Fig. 3). ⟨Q⟩ denotes the mean of quantity Q in an 80 μs bin. If both segments translocated
at the same speed, we would expect ⟨t_tot⟩ to decrease in proportion with any increase in ⟨t_2⟩.
Fig. 3 shows that ⟨t_tot⟩ in fact decreased approximately linearly with ⟨t_2⟩ until ⟨t_2⟩ ≈ 0.7 ms,
where ⟨t_tot⟩ began to rise. That turning point coincides approximately with the mean
translocation time for type 2 events.
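
The binning step can be sketched as below. The function name, the tuple layout of the result, and the use of the standard error of the mean as the reported uncertainty (following the caption of Fig. 3) are assumptions made for illustration; times are taken to be in ms, so the 80 μs bin width is 0.08.

```python
import math
import statistics
from collections import defaultdict


def binned_means(t2_values, ttot_values, width=0.08):
    """Group events into bins of t_2 and summarise each bin.

    Returns a list of (mean t_2, mean t_tot, standard error of mean t_tot, count),
    ordered by bin. The standard error is NaN for bins holding a single event.
    """
    bins = defaultdict(list)
    for t2, ttot in zip(t2_values, ttot_values):
        bins[int(t2 // width)].append((t2, ttot))
    rows = []
    for index in sorted(bins):
        t2s = [pair[0] for pair in bins[index]]
        ttots = [pair[1] for pair in bins[index]]
        n = len(ttots)
        sem = statistics.stdev(ttots) / math.sqrt(n) if n > 1 else float("nan")
        rows.append((statistics.mean(t2s), statistics.mean(ttots), sem, n))
    return rows
```

If the constant-speed picture held exactly, the second column would fall on the straight line ⟨t_tot⟩ = L/v − ⟨t_2⟩ when plotted against the first.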

The upswing in ⟨t_tot⟩ with ⟨t_2⟩ is the result of fluctuations in the translocation speed, as
the following dynamical model illustrates. Consider a folded molecule whose two segments
translocate with the same Gaussian distribution of speeds, G_{v_0,Δv}(v). v_0 is the mean translocation
speed and Δv is the standard deviation, which accounts for fluctuations. Accordingly,
if a segment translocates in a time t_2, the probability that its length was between L_s and
L_s + dL_s is given by:

$$ P(L_s \mid t_2)\, dL_s \propto G_{v_0,\Delta v}\!\left(\frac{L_s}{t_2}\right) \frac{dL_s}{t_2}. \tag{4} $$

The probability distribution P(L_s | t_2) dL_s is normalized by integrating over L_s from 0 to L.
The complementary segment has length L_l = L − L_s. The probability that it takes between
t_tot and t_tot + dt_tot to translocate is:

$$ P(t_{\mathrm{tot}} \mid L_s)\, dt_{\mathrm{tot}} \propto G_{v_0,\Delta v}\!\left(\frac{L - L_s}{t_{\mathrm{tot}}}\right) \frac{L - L_s}{t_{\mathrm{tot}}^{2}}\, dt_{\mathrm{tot}}. \tag{5} $$

Combining Eqs. 4 and 5, we find that when one segment translocates in a time t_2, the
complementary segment will translocate in a time between t_tot and t_tot + dt_tot with a probability
given by:

$$ P(t_{\mathrm{tot}} \mid t_2)\, dt_{\mathrm{tot}} \propto \left[ \int_0^{L} P(t_{\mathrm{tot}} \mid L_s)\, P(L_s \mid t_2)\, dL_s \right] dt_{\mathrm{tot}}. \tag{6} $$

The distribution P(t_tot | t_2) is normalized by integrating over t_tot from t_2 to ∞. A least
squares fit of Eq. 6 to the data in the first bin of Fig. 3 (⟨t_2⟩ = 0.015) obtains v_0 = 10.76 ±
0.06 mm/s and Δv/v_0 = 0.198 ± 0.005. With those parameters and Eq. 6, we calculated
⟨t_tot⟩ as a function of t_2 and plotted the results in Fig. 3. The predicted relationship agrees
well with the data.
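
A numerical sketch of this dynamical model is given below. It evaluates Eqs. 4 to 6 by direct midpoint-rule summation, using the fitted values quoted above (v_0 = 10.76 mm/s, which equals 10.76 μm/ms, and Δv/v_0 = 0.198) and L = 16.5 μm. The grid sizes, the finite upper cutoff `t_max` standing in for the integral to infinity, and the function names are all assumptions of this illustration; it is not the authors' fitting code.

```python
import math

L = 16.5          # contour length of lambda DNA, in micrometres
V0 = 10.76        # mean translocation speed, in um/ms (equal to mm/s)
DV = 0.198 * V0   # standard deviation of the speed, in um/ms


def gauss(v):
    """Unnormalised Gaussian speed distribution G_{v0,dv}(v)."""
    return math.exp(-0.5 * ((v - V0) / DV) ** 2)


def mean_t_tot(t_2, n_ls=300, n_t=1500, t_max=6.0):
    """Mean of t_tot (ms) given t_2 (ms), from Eqs. 4-6 on uniform grids."""
    if not 0.0 < t_2 < t_max:
        raise ValueError("require 0 < t_2 < t_max")
    d_ls = L / n_ls
    d_t = (t_max - t_2) / n_t
    # Eq. 4: weight of each candidate short-segment length L_s.
    weighted = [((i + 0.5) * d_ls, gauss((i + 0.5) * d_ls / t_2) / t_2)
                for i in range(n_ls)]
    cutoff = 1e-12 * max(w for _, w in weighted)
    weighted = [(ls, w) for ls, w in weighted if w > cutoff]  # drop negligible terms
    norm = 0.0
    first_moment = 0.0
    for j in range(n_t):
        t = t_2 + (j + 0.5) * d_t  # only t_tot >= t_2 is allowed
        # Eq. 6: sum over L_s of Eq. 5 times Eq. 4 (grid spacings cancel in the ratio).
        p = 0.0
        for ls, w in weighted:
            ll = L - ls
            p += w * gauss(ll / t) * ll / (t * t)
        norm += p
        first_moment += t * p
    return first_moment / norm
```

Evaluating `mean_t_tot` at a few values of t_2 should reproduce the qualitative shape described in the text: a decrease for small t_2, followed by a rise once t_2 exceeds roughly L/(2 v_0), because the condition t_tot ≥ t_2 then selects events in which both segments moved slowly.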

Importantly, the dynamical model demonstrates the robustness of our method for obtaining
the distribution of x in Fig. 2(b). Fluctuations lead to errors in estimating x for a
particular event, as one segment may translocate faster or slower than the other; however,
the relationship between t_tot and t_2 is the same on average as if there were no fluctuations.
Events with t_2 > 0.7 ms are drawn from tails of the speed distributions; ⟨t_tot⟩ rises with
⟨t_2⟩ because both segments of molecules captured at x ≈ 0.5 translocated more slowly than
average, not because the segments translocated at different speeds on average. Accordingly,
we found x ≈ 0.5 for those events.

Finally, the slope of the data in Fig. 3 for ⟨t_2⟩ < 0.7 ms reveals a weak dependence of the
translocation speed on the length of a segment. Long molecules are known to translocate
more slowly than short ones in unfolded configurations [15] because the moving segment is
longer and experiences more viscous drag when it is drawn to the nanopore from a large
coil [16, 17]. Storm et al. assumed a power law relationship between the translocation time
and the length of unfolded DNA, t ~ L^α, and found that the scaling exponent α = 1.27
[5]. Assuming that each segment of a folded molecule obeys a similar scaling relationship
and using L_s + L_l = L, we find that t_tot = (t_1^{1/α} − t_2^{1/α})^α, where t_1 is the translocation time
of unfolded molecules. We fitted that expression to the data in Fig. 3 for ⟨t_2⟩ < 0.7 ms to
obtain α = 1.19 ± 0.04. Accounting for the length-dependent speed in estimating x skews
the distribution, raising the best fit exponent to γ = 0.72 ± 0.02, which is closer to the
theoretical value.
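
The power-law relation follows from inverting t ∝ L^α for each segment, so that lengths scale as t^{1/α} and L_l = L − L_s becomes t_tot^{1/α} = t_1^{1/α} − t_2^{1/α}. The sketch below encodes that relation together with one plausible way to correct the estimate of x; the second function is our assumption about how such a correction could be made (both segments sharing the same prefactor), not a formula stated in the text, and both function names are hypothetical.

```python
def t_tot_power_law(t_2, t_1, alpha):
    """t_tot = (t_1^(1/alpha) - t_2^(1/alpha))^alpha, with t_1 the unfolded time."""
    if alpha <= 0 or not 0.0 <= t_2 <= t_1:
        raise ValueError("require alpha > 0 and 0 <= t_2 <= t_1")
    return (t_1 ** (1.0 / alpha) - t_2 ** (1.0 / alpha)) ** alpha


def capture_location_power_law(t_2, t_tot, alpha):
    """Estimate x = L_s / (L_s + L_l), taking each length proportional to t^(1/alpha)."""
    if alpha <= 0 or t_tot <= 0 or t_2 < 0 or t_2 > t_tot:
        raise ValueError("require alpha > 0 and 0 <= t_2 <= t_tot, t_tot > 0")
    short = t_2 ** (1.0 / alpha)
    long_ = t_tot ** (1.0 / alpha)
    return short / (short + long_)
```

For α = 1 both functions reduce to the constant-speed expressions. For α > 1 the corrected x is larger than the constant-speed estimate for any event with 0 < t_2 < t_tot, which moves events away from the ends and is consistent with the larger best-fit γ reported above.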

In conclusion, we measured the distribution of capture locations along λ DNA molecules
by an 8 nm wide solid-state nanopore and presented a theoretical model which explains
that distribution. Surprisingly, the strong bias for capturing molecules near their ends is a 
consequence of the configurational entropy of the approaching polymer; molecules do not 
search for an energetically favorable configuration before translocating. We also used folded 
DNA configurations to probe the dynamics of multiple polymer segments translocating a 
nanopore simultaneously, thereby quantifying the fluctuations and the length dependence of 
the translocation speed. 